Abstract: We consider an integer stochastic knapsack problem (SKP) where
the weight of each item is deterministic, but the vector of returns for the items
is random with known distribution. The objective is to maximize the probability
that a total return threshold is met or exceeded. We study several solution
approaches. Exact procedures, based on dynamic programming (DP) and integer
programming (IP), are developed for returns that are independent normal
random variables with integral means and variances. Computation indicates
that the DP is significantly faster than the most efficient algorithm to date. The IP
is less efficient, but is applicable to more general stochastic IPs with independent
normal returns. We also develop a Monte Carlo approximation procedure
to solve SKPs with general distributions on the random returns. This method
utilizes upper- and lower-bound estimators on the true optimal solution value
in order to construct a confidence interval on the optimality gap of a candidate
solution.


1  INTRODUCTION 


Consider the following stochastic integer programming problem with random
objective function,

\[
\begin{array}{ll}
\max & P(rx \ge c) \\
\text{s.t.} & Ax \le b \\
 & x \in Z_+^K,
\end{array}
\tag{1}
\]

where $Z_+^K$ is the set of non-negative integer $K$-vectors, $Ax \le b$ are deterministic
constraints, $c$ is a deterministic "return threshold," but $r = (r_1, r_2, \ldots, r_K)$ is a
random vector with known distribution. The problem is to select an optimal
$x$, denoted $x^*$, which maximizes the probability that the return $rx$ meets or
exceeds threshold $c$.

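The stochastic knapsack problem (SKP) is a special case of (1) that may be
formulated as follows:

\[
\begin{array}{ll}
\max\limits_{x} & P\left( \sum_{k=1}^{K} \sum_{l \in \mathcal{L}_k} r_{kl} x_{kl} \ge c \right) \\
\text{s.t.} & \sum_{k=1}^{K} \sum_{l \in \mathcal{L}_k} w_k x_{kl} \le W \\
 & x_{kl} \in \{0,1\} \quad \forall k,\ l \in \mathcal{L}_k.
\end{array}
\tag{2}
\]

Here, $\sum_{l \in \mathcal{L}_k} x_{kl}$ is the number of items of type $k$ to include in the knapsack,
and $|\mathcal{L}_k|$ is an upper bound on this value. The deterministic weight of each
item is $w_k > 0$ and $W$ is the known weight capacity of the knapsack. The
returns $r_{k1}, \ldots, r_{k|\mathcal{L}_k|}$ for a specific item type $k$ are identically distributed.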

The dependence structure of the returns $r_{kl}$ is clearly an important modeling
consideration. The variants of the integer SKP addressed in Steinberg and
Parks [24], Sniedovich [23], Henig [11], and Carraway et al. [4] have returns
that are normal random variables which are independent both between item
types and within an item type. Independence within an item type means
that $r_{k1}, \ldots, r_{k|\mathcal{L}_k|}$ are mutually independent random variables for each $k$. In
some systems this assumption is reasonable: For example, if we are purchasing
production equipment in an attempt to satisfy a certain threshold production
level and if machines fail independently, it may be appropriate to model the
production rates of multiple machines of the same type as independent random
variables. On the other hand, realizations of the returns on multiple financial
instruments (e.g., stocks, bonds) of the same type are typically identical. In
this latter case, and under the assumption that $|\mathcal{L}_k|$ is limited only by the
weight capacity of the knapsack, (2) can be simplified to


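\[
\begin{array}{ll}
\max\limits_{x} & P\left( \sum_{k=1}^{K} r_k x_k \ge c \right) \\
\text{s.t.} & \sum_{k=1}^{K} w_k x_k \le W \\
 & x_k \in Z_+ \quad \forall k,
\end{array}
\tag{3}
\]

where $r_k = r_{k1} = r_{k2} = \cdots = r_{k|\mathcal{L}_k|}$, wp1.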

Sniedovich  [23]  and  Henig  [11]  discuss  various  optimality  criteria  for  integer 
SKPs,  and  Prekopa  [20,  pp.  243-247]  describes  methods  of  handling  random 
objective  functions  in  stochastic  programs.  Under  the  assumption  of  normally 
distributed  coefficients,  Greenberg  [10],  Ishii  and  Nishida  [12],  and  Morita  et  al. 
[16] examine SKPs with continuous decision variables. There is a separate
literature regarding on-line stochastic knapsack problems which have applications
in  telecommunications;  see,  for  example,  Chiu  et  al.  [5],  Gavious  and  Rosberg 
[8],  Marchetti-Spaccamela  and  Vercellis  [15],  Papastavrou  et  al.  [19],  Ross  [21], 
and  Ross  and  Tsang  [22].  While  there  are  many  variants  of  on-line  SKPs,  all 
have  the  property  that  items  arrive  over  time  and  must  be  accepted  or  rejected 
upon  arrival  without  knowing  what  items  will  be  available  for  consideration  in 
the  future.  In  this  paper  we  restrict  our  attention  to  (2),  a  “static”  SKP. 

In Section 2, we discuss the special case of the SKP in which the returns are
normal random variables that are independent both between and within item
types, i.e., model (2) with returns $r_{kl}$ being mutually independent for all $l$ and
$k$. The returns within a type are identically distributed and are assumed to
have integral mean $\mu_k = E r_{k1}$ and integral variance $v_k = \operatorname{var} r_{k1} > 0$.
Section 2 derives a simple dynamic-programming-based algorithm for this problem,
demonstrates the algorithm's computational effectiveness, and then proposes
and illustrates the viability of integer programming methods for solving both
the SKP and model (1), which may have general linear constraints. (In the
rest of the paper, "DP" will mean "dynamic program" or "dynamic programming,"
and "IP" will mean "integer program" or "integer programming.") In
Section 3, we consider the case where the returns are governed by general
distributions that can have arbitrary dependency structures both between and
within item types. For such problems, we apply a Monte Carlo procedure that
finds a feasible candidate solution $\hat{x}$ and constructs confidence intervals on its
optimality gap, $P(rx^* \ge c) - P(r\hat{x} \ge c)$.
2  SKP  WITH  INDEPENDENT  NORMAL  RETURN  DISTRIBUTIONS

Let $r_{kl} \sim N(\mu_k, v_k)$, where $N(\mu_k, v_k)$ is a normal random variable with integral
mean $\mu_k$ and integral variance $v_k$, and assume that all $r_{kl}$, $k = 1, \ldots, K$, $l \in \mathcal{L}_k$,
are independent, i.e., the returns are independent both between and within item
types. Let $\mu = (\mu_1, \ldots, \mu_K)$ and $v = (v_1, \ldots, v_K)$. Under these assumptions,

\[
P\left[ \sum_{k=1}^{K} \sum_{l \in \mathcal{L}_k} r_{kl} x_{kl} \ge c \right]
= P\left[ N(0,1) \ge \frac{c - \sum_{k=1}^{K} \sum_{l \in \mathcal{L}_k} \mu_k x_{kl}}{\left( \sum_{k=1}^{K} \sum_{l \in \mathcal{L}_k} v_k x_{kl} \right)^{1/2}} \right]
= P\left[ N(0,1) \ge (c - \mu x)/\sqrt{vx} \right],
\]

where $x_k = \sum_{l \in \mathcal{L}_k} x_{kl}$ and $x = (x_1, \ldots, x_K)^T \ne 0$. We can therefore maximize
the probability of exceeding the return threshold, subject to $x \in X = \{x :
Ax \le b,\ x \in Z_+^K\}$, by solving

\[
\begin{array}{rl}
\rho^* = \min & (c - \mu x)/\sqrt{vx} \\
\text{s.t.} & x \in X
\end{array}
\tag{4}
\]

provided $x^* \ne 0$. This condition is assumed to hold throughout this section
since the possibility that $x^* = 0$ is a simple special case to check. For the
stochastic knapsack problem with normal returns, (4) specializes to

\[
\text{SKP}(W) \qquad
\begin{array}{rl}
\rho^*(W) = \min\limits_{x} & (c - \mu x)/\sqrt{vx} \\
\text{s.t.} & wx \le W \\
 & x \in Z_+^K.
\end{array}
\tag{5}
\]
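As a quick numerical illustration of the identity behind (4), the following Python sketch evaluates $P(rx \ge c)$ from $\rho = (c - \mu x)/\sqrt{vx}$ using the standard normal tail. The function name `success_probability` and the sample numbers are illustrative assumptions, not data from the text.

```python
import math

def success_probability(x, mu, v, c):
    """P(rx >= c) for independent normal returns with means mu and variances v.

    x[k] is the number of items of type k (independent within a type).
    """
    mean = sum(m * xk for m, xk in zip(mu, x))
    var = sum(s * xk for s, xk in zip(v, x))
    if var == 0:
        # x = 0: the total return is deterministically 0
        return 1.0 if c <= 0 else 0.0
    rho = (c - mean) / math.sqrt(var)
    # P(N(0,1) >= rho) written with the complementary error function
    return 0.5 * math.erfc(rho / math.sqrt(2.0))

# Hypothetical two-type example: a smaller rho gives a larger probability.
p_one = success_probability([1, 0], mu=[5, 3], v=[4, 1], c=5)   # rho = 0
p_two = success_probability([2, 0], mu=[5, 3], v=[4, 1], c=5)   # rho < 0
```

Minimizing $\rho$ and maximizing the probability are therefore the same task whenever $x \ne 0$, which is why the rest of the section works with $\rho$ directly.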

A standard way of attacking (4) and (5), e.g., Henig [11], due in concept
to Geoffrion [9], involves solving $\min_{x \in X} (\lambda\mu + (1 - \lambda)v)x$ multiple times for
different values of $\lambda$ between 0 and 1. However, the method is not guaranteed
to achieve an optimal solution when $\rho^* > 0$, i.e., when $P(rx^* \ge c) < 1/2$
[11]. Carraway et al. [4] use another solution method for SKP($W$), one that is based
on "generalized dynamic programming" [2]. Generalized DP maintains a set of
partial solutions for each state of the knapsack (amount of capacity consumed):
These partial solutions are ones that might be extended to an optimal solution.
(Standard DP maintains only a single solution for each state.) The generalized
technique requires that specialized bounds be computed to eliminate partial
solutions by proving that they cannot be extended to an optimal solution.
In Section 2.1, we develop a DP procedure for solving SKP($W$) that is much
simpler in concept than the methods described above and is guaranteed to yield
an optimal solution in all cases. In Section 2.2 we show how IP techniques may
be used to solve SKP($W$) and the more general problem (4). While the IP
approach is less efficient than the DP procedure, it can still be used to solve
SKP($W$) effectively and it has the advantage that any type of linear constraints
can be incorporated in the model.

2.1  Dynamic  Programming  Method 

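Suppose that we know valid, integral lower and upper bounds, $\underline{v}$ and $\bar{v}$
respectively, on $v^* = vx^*$ where $x^*$ is an optimal solution to SKP($W$). Let
$V = \{\underline{v}, \underline{v} + 1, \ldots, \bar{v}\}$. Since all data are integral, SKP($W$) and the following
problem are equivalent:

\[
\begin{array}{rl}
\rho^* = \min\limits_{v \in V}\ \min\limits_{x} & (c - \mu x)/\sqrt{v} \\
\text{s.t.} & vx = v \\
 & x \in X.
\end{array}
\tag{6}
\]

(In the constraint $vx = v$, the left-hand side is the variance vector times $x$
and the right-hand side is the scalar element of $V$.)
For fixed $v$, the objective function in (6) is minimized when $\mu x$ is maximized.
Therefore, (6) can be solved by solving

\[
\begin{array}{ll}
\max\limits_{x} & \mu x \\
\text{s.t.} & vx = v \\
 & x \in X
\end{array}
\tag{7}
\]

to obtain solutions $x'_v$ for each $v \in V$. Then, $\rho^* = \min_{v \in V} (c - \mu x'_v)/\sqrt{v}$,
and any solution $x'_v$, $v \in V$, which satisfies $\rho^* = (c - \mu x'_v)/\sqrt{v}$ is an optimal
solution to (4).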

Applying the above methodology to SKP($W$), (7) becomes

\[
\text{KP}(W, v) \qquad
\begin{array}{ll}
\max\limits_{x} & \mu x \\
\text{s.t.} & wx \le W \\
 & vx = v \\
 & x \in Z_+^K,
\end{array}
\tag{8}
\]

which is just a two-constraint IP that can be solved with reasonable efficiency
by extending the standard DP algorithm for the simple knapsack problem. (A
text such as Dreyfus and Law [7, pp. 108-110] describes the basic recursion and
algorithm; Weingartner and Ness [26] and Nemhauser and Ullman [18] solve
knapsack problems with multiple constraints using DP.) Described below is a
scheme for solving SKP($W$), based on solving a family of problems of the form
KP($W$, $v$), by DP.

Let $f(w, v)$ denote the optimal solution value to KPE($w$, $v$), which is KP($W$, $v$)
except that $wx \le W$ is replaced by $wx = w$. For pairs $(w, v)$ that yield an
infeasible problem KPE($w$, $v$), we use the convention that $f(w, v) = -\infty$. The
first phase of the following algorithm recursively determines $f(w, v)$ for $w \in
\{\underline{w}, \underline{w} + 1, \ldots, W\}$ and $v \in \{\underline{v}, \underline{v} + 1, \ldots, \bar{v}\}$, where $\underline{w} = \min_k w_k$, $\underline{v} = \min_k v_k$,
and $\bar{v} = \max_k \lfloor v_k W / w_k \rfloor$. (The floor operator, $\lfloor \cdot \rfloor$, yields the greatest
integer that does not exceed its argument. Tighter bounds on $v^*$ are possible,
but these choices of $\underline{v}$ and $\bar{v}$ suffice.) Now, define SKPE($w$) as SKP($W$) but
with the constraint $wx \le W$ replaced by $wx = w$. The second phase of the
algorithm determines the optimal objective value $\rho(w)$ to SKPE($w$) for each
$w \in \{\underline{w}, \underline{w} + 1, \ldots, W\}$; all possible values of $v$ are examined to do this, for
each value of $w$. (Values of $w < \underline{w}$ are ignored since $x^* = 0$ is trivially optimal
in such cases.) Finally, the third phase extracts the optimal solution $x^*(w)$ to
SKP($w$) for each $w \in \{\underline{w}, \underline{w} + 1, \ldots, W\}$. This is simply the best solution
to SKPE($w'$) over all $w' \in \{\underline{w}, \underline{w} + 1, \ldots, w\}$.

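Algorithm DPSKP

Input: Integer data for SKP($W$) with $K$ item types: $w$, $\mu$, $v$, $c$, $W \ge \min_k w_k$.
Output: Optimal solution $x^*(w)$ and solution value $\rho^*(w)$ to SKP($w$) for all
$w \in \{\min_k w_k, \ldots, W\}$.

{

    /* Phase 1 */

    $\underline{w} \leftarrow \min_k w_k$;  $\underline{v} \leftarrow \min_k v_k$;  $\bar{v} \leftarrow \max_k \lfloor v_k W / w_k \rfloor$;

    $f(w, v) \leftarrow -\infty$  $\forall (w, v)$ with $\underline{w} - \max_k w_k \le w \le W$, $\underline{v} - \max_k v_k \le v \le \bar{v}$;

    $f(0, 0) \leftarrow 0$;

    For ($w = \underline{w}$ to $W$ and $v = \underline{v}$ to $\bar{v}$) {

        $k(w, v) \leftarrow \operatorname{argmax}_{k \in \{1, \ldots, K\}} [f(w - w_k, v - v_k) + \mu_k]$;

        $f(w, v) \leftarrow f(w - w_{k(w,v)}, v - v_{k(w,v)}) + \mu_{k(w,v)}$;

    }

    /* Phase 2 */

    For ($w = \underline{w}$ to $W$) {

        $v' \leftarrow \operatorname{argmin}_{v \in \{\underline{v}, \ldots, \bar{v}\}} (c - f(w, v))/\sqrt{v}$;

        $\rho(w) \leftarrow (c - f(w, v'))/\sqrt{v'}$;  $k(w) \leftarrow k(w, v')$;

        $\hat{w}(w) \leftarrow \operatorname{argmin}_{w' \in \{\underline{w}, \ldots, w\}} \rho(w')$;

    }

    /* Phase 3 */

    For ($w = \underline{w}$ to $W$) {

        $x \leftarrow 0$;  $w' \leftarrow \hat{w}(w)$;

        While ($w' \ne 0$) {  $x_{k(w')} \leftarrow x_{k(w')} + 1$;  $w' \leftarrow w' - w_{k(w')}$;  }

        Print {"Solution to SKP($w$) for $w$ =", $w$, "is $x^*(w)$ =", $x$};
        Print {"with optimal objective value $\rho^*(w)$ =", $\rho(\hat{w}(w))$};

    }

}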
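The three phases can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the Turbo-Pascal code used in the experiments: it solves SKP($W$) for a single capacity only, it stores the choice $k(w, v)$ for every state and backtracks through $(w, v)$ pairs, and the helper `brute_force` and the toy data are hypothetical additions used only to cross-check the result.

```python
import math
from itertools import product

def dpskp(w, mu, v, c, W):
    """Return (rho_star, x_star) for SKP(W); (None, None) if only x = 0 fits."""
    K = len(w)
    v_bar = max(v[k] * W // w[k] for k in range(K))  # upper bound on vx
    NEG = float("-inf")
    # f[cw][cv] = max mu.x subject to w.x = cw and v.x = cv (NEG if infeasible)
    f = [[NEG] * (v_bar + 1) for _ in range(W + 1)]
    choice = [[-1] * (v_bar + 1) for _ in range(W + 1)]
    f[0][0] = 0
    # Phase 1: two-constraint knapsack recursion
    for cw in range(1, W + 1):
        for cv in range(1, v_bar + 1):
            for k in range(K):
                pw, pv = cw - w[k], cv - v[k]
                if pw >= 0 and pv >= 0 and f[pw][pv] + mu[k] > f[cw][cv]:
                    f[cw][cv] = f[pw][pv] + mu[k]
                    choice[cw][cv] = k
    # Phase 2: scan all reachable (weight, variance) states for the best ratio
    best = None
    for cw in range(1, W + 1):
        for cv in range(1, v_bar + 1):
            if f[cw][cv] > NEG:
                rho = (c - f[cw][cv]) / math.sqrt(cv)
                if best is None or rho < best[0]:
                    best = (rho, cw, cv)
    if best is None:
        return None, None
    # Phase 3: recover x by walking the stored choices back to state (0, 0)
    rho, cw, cv = best
    x = [0] * K
    while cw > 0:
        k = choice[cw][cv]
        x[k] += 1
        cw, cv = cw - w[k], cv - v[k]
    return rho, x

def brute_force(w, mu, v, c, W):
    """Enumerate every nonzero feasible x (for cross-checking tiny instances)."""
    K, best = len(w), None
    for x in product(*(range(W // w[k] + 1) for k in range(K))):
        if any(x) and sum(w[k] * x[k] for k in range(K)) <= W:
            mean = sum(mu[k] * x[k] for k in range(K))
            var = sum(v[k] * x[k] for k in range(K))
            rho = (c - mean) / math.sqrt(var)
            if best is None or rho < best:
                best = rho
    return best

# Hypothetical toy data (not the Steinberg-Parks instance)
rho_star, x_star = dpskp([3, 4, 5], [4, 6, 7], [8, 9, 25], c=12, W=10)
```

The table `f` has about $\bar{v} W$ entries and each entry inspects $K$ predecessors, which matches the statement later in this section that the work in DPSKP is proportional to $\bar{v} W$.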

To test the algorithm, the data from Steinberg and Parks [24] are used to
create 28 SKPs, one for each $W \in \{3, \ldots, 30\}$, and we compare our results
against the most recent computational work on these SKPs in Carraway et al.
[4]. The data describe a small stochastic knapsack problem with $c = 30$ and ten
items with weights, means, and variances in the following ranges: $3 \le w_k \le 12$,
$4 \le \mu_k \le 16$, and $8 \le v_k \le 25$. DPSKP was programmed in Turbo-Pascal
as in [4] but run on a faster personal computer, a Dell Latitude Xpi laptop
computer with 40 megabytes of RAM and a 133 MHz Pentium processor. A
modest number of enhancements are made in the algorithm for efficiency's sake.
For instance, $\bar{v}$ is made a function of $w$ via $\bar{v}(w) = \max_k \lfloor v_k w / w_k \rfloor$. The total
solution time for the algorithm (for all values of $W$ between 3 and 30) is 0.026
seconds, which includes printing the solution but excludes time necessary for
input. This compares to a solution time (on an IBM PS/2 Model 50) of 114.15
seconds reported in [4] for all 28 problems and a solution time of 14.11 seconds
for the single hardest problem ($W = 30$). (The method of [4], although partially
based on DP, does not solve SKP($w$) sequentially for increasing values of $w$.
Thus, we report the sum of their solution times for all $W \in \{3, \ldots, 30\}$ as well
as the time for $W = 30$.)

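Solution times for the Steinberg-Parks data can be reduced by taking
advantage of the fact that $\bar{v}$ is large compared to $\bar{\mu}$, an analogous integral upper
bound on $\mu x^*$. Let $\underline{\mu}$ be a lower bound on $\mu x^*$ and let $U = \{\underline{\mu}, \underline{\mu} + 1, \ldots, \bar{\mu}\}$.
The optimization of SKP($W$) can then be rearranged to

\[
\begin{array}{rl}
\rho^* = \min\limits_{\mu \in U}\ \min\limits_{x} & (c - \mu)/\sqrt{vx} \\
\text{s.t.} & \mu x = \mu \\
 & x \in X.
\end{array}
\tag{9}
\]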
For fixed $\mu > c$, the objective is minimized when $vx$ is minimized, but if $\mu < c$,
the objective is minimized when $vx$ is maximized. Thus, there are two cases
to handle ($\mu = c$ is a simple special case we ignore): If (9) is feasible for $\mu > c$,
we redefine the lower bound as $\underline{\mu} = c + 1$ and for all values of $\mu \in U$, solve

\[
\text{MIN}(\mu) \qquad
\begin{array}{ll}
\min\limits_{x} & vx \\
\text{s.t.} & \mu x = \mu \\
 & x \in X
\end{array}
\tag{10}
\]

for $x'_\mu$. Otherwise, we redefine the upper bound as $\bar{\mu} = c - 1$ and for all
$\mu \in U$ solve MAX($\mu$) for $x'_\mu$, where MAX($\mu$) is MIN($\mu$) with "max" replacing
"min." Then, $x^* \in \operatorname{argmin}_{x'_\mu,\ \mu \in U} (c - \mu x'_\mu)/\sqrt{vx'_\mu}$. (Note that it is possible
to determine which case must be considered first by solving $\max_{x \in X} \mu x$ and
observing whether or not the solution value exceeds $c$.)
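The case split can be checked numerically on a tiny instance. In the sketch below, plain enumeration stands in for solving MIN($\mu$) and MAX($\mu$) (an assumption made only to keep the example short), and the function names and toy data are hypothetical. The point is that keeping one extreme-variance solution per mean level loses nothing relative to minimizing the ratio directly.

```python
import math
from itertools import product

def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))

def feasible_points(w, W):
    """Nonzero integer x >= 0 with wx <= W (enumeration; tiny instances only)."""
    ranges = [range(W // wk + 1) for wk in w]
    return [x for x in product(*ranges) if any(x) and dot(w, x) <= W]

def solve_directly(w, mu, v, c, W):
    """Minimize (c - mu.x)/sqrt(v.x) over all feasible points."""
    return min((c - dot(mu, x)) / math.sqrt(dot(v, x))
               for x in feasible_points(w, W))

def solve_by_mean_levels(w, mu, v, c, W):
    """Keep, for each mean level, only the extreme variance, as in (9)-(10)."""
    points = feasible_points(w, W)
    mu_bar = max(dot(mu, x) for x in points)
    if mu_bar == c:
        raise ValueError("mu_bar == c is the special case set aside in the text")
    minimize_variance = mu_bar > c      # otherwise every level is below c
    best_var = {}                       # mean level -> best variance found
    for x in points:
        level, var = dot(mu, x), dot(v, x)
        if minimize_variance and level <= c:
            continue                    # only levels c+1, ..., mu_bar matter
        if level not in best_var:
            best_var[level] = var
        elif minimize_variance:
            best_var[level] = min(best_var[level], var)
        else:
            best_var[level] = max(best_var[level], var)
    return min((c - level) / math.sqrt(var) for level, var in best_var.items())

# Hypothetical toy data: c = 12 falls in the "minimize variance" case,
# c = 30 in the "maximize variance" case.
data = ([3, 4, 5], [4, 6, 7], [8, 9, 25])
gap_low_c = solve_by_mean_levels(*data, 12, 10) - solve_directly(*data, 12, 10)
gap_high_c = solve_by_mean_levels(*data, 30, 10) - solve_directly(*data, 30, 10)
```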

The above idea is easily specialized to SKP($W$). The most computationally
expensive part of the modified algorithm will be the analogs of Phase 1, one
where we obtain the solution value $f(w, \mu)$ by maximizing $vx$ subject to $wx =
w$, $\mu x = \mu$ and $x \in Z_+^K$, and the other where we obtain $f(w, \mu)$ by minimizing
$vx$ subject to the same constraints. This work will be roughly proportional to
$\bar{\mu} W + (c - 1)\tilde{w}$ where $\tilde{w}$ is the largest value of $w$ for which there is no feasible $x$
with $\mu x > c$, $wx \le w$, $x \in Z_+^K$. The total work is therefore no worse than $2\bar{\mu} W$,
versus the work in DPSKP which is proportional to $\bar{v} W$. For the test data set,
$\bar{\mu} = \max_k \lfloor \mu_k W / w_k \rfloor = 68$ and $\bar{v} = \max_k \lfloor v_k W / w_k \rfloor = 266$. Thus, we would
expect the modified algorithm to require 1/4 to 1/2 the work of DPSKP. This
expectation is realized by a solution time of 0.009 seconds, excluding input.
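The "1/4 to 1/2" estimate follows directly from the two bounds; written out with only the figures reported in this section (the measured ratio compares the 0.009-second and 0.026-second solution times):

```latex
\[
\frac{\bar{\mu} W}{\bar{v} W} = \frac{68}{266} \approx 0.26,
\qquad
\frac{2\bar{\mu} W}{\bar{v} W} = \frac{136}{266} \approx 0.51,
\qquad
\frac{0.009\ \text{s}}{0.026\ \text{s}} \approx 0.35 .
\]
```

The measured ratio lies between the two work-based estimates.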

Several final comments should be made on the basic methodology of DPSKP.
The algorithm is easy to program and computer memory requirements
are modest: The Steinberg-Parks problems require less than 0.1 megabytes of
RAM. DPSKP is easily extended to bounded variables by solving the
bounded-variable version of SKPE($W$), which is just a two-constraint, bounded-variable
knapsack problem. (Dantzig [6] solves the bounded-variable knapsack problem;
Nemhauser and Ullman [18] and Weingartner and Ness [26] solve
multiple-constraint knapsack problems.) Furthermore, a bounded-variable algorithm
could be easily modified to handle the dependent (perfectly correlated) case of
SKP, problem (3).

2.2  Integer  Programming  Methods 

The question raised and answered in this section is "Are specialized codes
necessary to solve the SKP?" It is shown here that SKP($W$) is readily solved
using off-the-shelf integer programming tools, i.e., an algebraic modeling
language and a linear-programming-based branch-and-bound solution algorithm.
Instead of hours of programming and a fraction of a second of execution time,
solutions can be obtained with minutes of programming and a few seconds of
execution time. All of the techniques developed are actually applicable to general
problems in the form of (4) and are described as such. However, computational
testing is only performed on the Steinberg-Parks problems.


2.2.1  A  Simple  Linearization.  One of the simplest approaches to solving
(4) via integer programming is to linearize the objective by taking its logarithm.
The appropriate linearization depends on the sign of $c - \mu x^*$: We first solve
$\bar{\mu} = \max_{x \in X} \mu x$ and obtain solution $x'$. By observing that $c - \mu x^*$ and $c - \bar{\mu}$
have the same sign, the problem may be separated into three cases. In case
(a), $c = \bar{\mu}$ and $x'$ is optimal for (4). The following discussion considers case
(b) where $c > \bar{\mu}$; the linearization for case (c) where $c < \bar{\mu}$ is then a symmetric
modification of case (b).

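In case (b), a logarithmic linearization yields

\[
\text{LIN1(b)} \qquad
\begin{array}{ll}
\min\limits_{h,\, d,\, x \in X} & \sum_{i=\underline{\mu}}^{\bar{\mu}} \log(c - i)\, h_i - \frac{1}{2} \sum_{j=\underline{v}}^{\bar{v}} (\log j - \log(j - 1))\, d_j \\
\text{s.t.} & \sum_{i=\underline{\mu}}^{\bar{\mu}} i\, h_i = \mu x \\
 & \sum_{i=\underline{\mu}}^{\bar{\mu}} h_i = 1 \\
 & \sum_{j=1}^{\bar{v}} d_j = vx \\
 & h_i \in \{0, 1\} \text{ for } i = \underline{\mu}, \ldots, \bar{\mu} \\
 & 0 \le d_j \le 1 \text{ for } j = \underline{v} + 1, \ldots, \bar{v} \\
 & d_j = 1 \text{ for } j = 1, \ldots, \underline{v}.
\end{array}
\tag{11}
\]

When $\mu x = i'$, $h_{i'} = 1$ and $h_i = 0$ for all $i \ne i'$, and when $vx = j'$, it follows
that $d_j = 1$ for $j = 1, \ldots, j'$ and $d_j = 0$ for $j > j'$. Although $d_j$ is allowed
to be continuous, it will be binary in an optimal solution since $vx$ is integer,
$-(\log j - \log(j - 1))$ is an increasing function in $j$, and since the objective
function is being minimized.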
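The reason the linear objective reproduces $\log \rho$ is that the increments $\log j - \log(j - 1)$ telescope. The short Python check below illustrates this for one hypothetical point; the function name, the starting index `j_start`, and the numbers are assumptions chosen for the example. Starting the sum at any index of 2 or more only shifts the objective by a constant, so the minimizer is unchanged.

```python
import math

def linearized_objective(mean_total, var_total, c, j_start):
    """Value of the logarithmic objective at the point with mu.x = mean_total
    and v.x = var_total, where d_j = 1 exactly for j <= var_total."""
    log_numerator = math.log(c - mean_total)          # term selected by h_i = 1
    log_variance_part = sum(math.log(j) - math.log(j - 1)
                            for j in range(j_start, var_total + 1))
    return log_numerator - 0.5 * log_variance_part

# Hypothetical case (b) point: c = 30, mu.x = 22, v.x = 40, sum starting at j = 8.
value = linearized_objective(22, 40, c=30, j_start=8)
rho = (30 - 22) / math.sqrt(40)
# Telescoping: the sum equals log(40) - log(7), so value = log(rho) + 0.5*log(7).
shift = 0.5 * math.log(8 - 1)
```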

We have formulated LIN1(b) in the algebraic modeling language GAMS [1]
and solved the Steinberg-Parks problems, for appropriate values of $W$, using
the mixed-integer programming solver XA [27]. We use the same Dell laptop
computer as in the previous section. The bound parameters used are $\underline{\mu} =
\min_k \mu_k$, $\bar{\mu} = c - 1$, $\underline{v} = \min_k v_k$, $\bar{v} = \max_k \lfloor v_k W / w_k \rfloor$. The problems had 122
variables although some of these were fixed. Table 1 lists the solution times
(reported as "Resource Utilization" in the GAMS output) for the Steinberg-Parks
problems for all $W \in \{3, \ldots, 18\}$ (for which $\mu x^* < c$). Tighter bounds on
$\underline{v}$, $\bar{v}$, $\underline{\mu}$, and $\bar{\mu}$ can reduce the number of decision variables and speed solution
time, but we pursue this issue in the next section.

W              3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18
Total sec.   .14  .16  .19  .12  .26  .23  .26  .47  .09  .60  .36  .36  .37  .43  .13  .22

Table 1  Solution times for model LIN1(b)

The linearization for the case where $\mu x^* > c$ is analogous to LIN1(b) and
is straightforward: The roles of $h_i$ and $d_j$ are reversed in that the $h_i$ become
continuous between 0 and 1, the $d_j$ are binary, $d_j = 1$ implies $vx = j$, and
$h_i = 1$ implies $\mu x \ge i$. The objective function to be linearized and maximized
is $(\mu x - c)/\sqrt{vx}$. Initial tests with this case were not as successful as LIN1(b),
at least partially because $\bar{v}$ is always larger than $\bar{\mu}$ and there are many more
binary variables. Rather than trying to improve the linearization for this case,
a rather different linearization is developed and tested next.

2.2.2  Another  linearization.  The linearization described below, for case
(c) where $\mu x^* \ge c + 1$, uses binary variables to enumerate all possible values for
$\mu x^* \ge c + 1$ and $vx^* \ge \underline{v}$. Once a few auxiliary problems have been solved, the
enumeration required is not burdensome, at least for the Steinberg-Parks problems. The
method is described only for case (c) but with minor modifications can also be
used for case (b).

For values of $i$ and $j$ such that $c + 1 \le i \le \bar{\mu}$ and $\underline{v} \le j \le \bar{v}$, define the
binary variable $y_{ij}$ to be 1 if $\mu x = i$ and $vx = j$, and to be 0 otherwise. Also,
define $\rho_{ij} = (c - i)/\sqrt{j}$. Then, (4) is equivalent to




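\[
\text{LIN2(c)} \qquad
\begin{array}{rl}
\rho^* = \min\limits_{x \in X,\, y} & \sum_{(i,j) \in IJ} \rho_{ij}\, y_{ij} \\
\text{s.t.} & \sum_{(i,j) \in IJ} i\, y_{ij} = \mu x \\
 & \sum_{(i,j) \in IJ} j\, y_{ij} = vx \\
 & \sum_{(i,j) \in IJ} y_{ij} = 1 \\
 & y_{ij} \in \{0, 1\} \quad \forall (i,j) \in IJ,
\end{array}
\tag{12}
\]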


where $I = \{c + 1, \ldots, \bar{\mu}\}$, $J = \{\underline{v}, \ldots, \bar{v}\}$ and $IJ = I \times J$. Like the logarithmic
linearization of Section 2.2.1, (12) requires the addition of only three structural
constraints, but the potential number of binary variables is much larger. The
required number of variables can be reduced drastically, however, by solving
a sequence of auxiliary problems to find tight values for $\bar{\mu}$, $\underline{v}$, $\bar{v}$, and another
bound $\bar{\rho} \ge \rho^*$. (Any elements $(i,j) \in IJ$ with $\rho_{ij} > \bar{\rho}$ are deleted.) The
four-part procedure described next for solving LIN2(c) has proven successful in
practice:


Step (1) Establish $\bar{\rho}$ by finding a "good" feasible solution to (4): We solve
a simplification of (4) with a linear objective, $\min_{x \in X'} sx$, to obtain $x'_1$, where
$X' = X \cap \{\mu x \ge c + 1\}$ and $s_k = \sqrt{v_k} - \mu_k$. Then, $\bar{\rho} = (c - \mu x'_1)/\sqrt{vx'_1}$.


Step (2) Establish $\bar{\mu}$ and $\bar{v}$: Solve $\max_{x \in X} \mu x$ to obtain $x'_2$ and let $\bar{\mu} = \mu x'_2$
and $\bar{v} = vx'_2$. The variance bound is valid since

\[
\frac{c - \mu x^*}{\sqrt{vx^*}} \le \frac{c - \mu x'_2}{\sqrt{vx'_2}}
\quad \text{and} \quad
c - \mu x'_2 \le c - \mu x^* < 0
\quad \text{imply} \quad
\sqrt{vx^*} \le \sqrt{vx'_2}.
\]

Additionally, if $x'_2$ is a better solution to (4) than is $x'_1$, $\bar{\rho}$ is reduced to
$(c - \mu x'_2)/\sqrt{vx'_2}$.


Step (3) Establish $\underline{v}$: Solve $\min_{x \in X'} vx$ to obtain $x'_3$, where $X' = X \cap \{\mu x \ge
c + 1\}$. Let $\underline{v} = vx'_3$ and update $\bar{\rho}$ if $x'_3$ is a better solution for (4) than are $x'_1$
and $x'_2$.


Step (4) Solve LIN2(c): After the three auxiliary problems are solved and
good values for $\bar{\mu}$, $\underline{v}$, $\bar{v}$ and $\bar{\rho}$ are established, a "tight" version of LIN2(c) is
then solved.
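To see how the bounds from Steps (1) through (3) shrink the index set before Step (4), the following Python sketch enumerates the pairs $(i, j)$ and drops those with $\rho_{ij} > \bar{\rho}$. The function name and the numerical bounds are hypothetical placeholders; in the procedure above the bounds would come from the auxiliary problems.

```python
import math

def pruned_index_set(c, mu_bar, v_lo, v_hi, rho_bar):
    """Pairs (i, j) retained for LIN2(c), mapped to their coefficients rho_ij."""
    kept = {}
    for i in range(c + 1, mu_bar + 1):          # I = {c+1, ..., mu_bar}
        for j in range(v_lo, v_hi + 1):         # J = {v_lo, ..., v_hi}
            rho_ij = (c - i) / math.sqrt(j)
            if rho_ij <= rho_bar:               # pairs above rho_bar cannot be optimal
                kept[(i, j)] = rho_ij
    return kept

# Hypothetical bounds, for illustration only.
kept = pruned_index_set(c=30, mu_bar=34, v_lo=20, v_hi=40, rho_bar=-0.5)
full_size = (34 - 30) * (40 - 20 + 1)           # |I| * |J| before pruning
```

With these placeholder values, 38 of the 84 candidate pairs survive, which gives a sense of why tight bounds matter for the number of binary variables $y_{ij}$.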
W                19   20   21   22   23   24   25   26   27   28   29   30
Step (1) sec.   .07  .08  .08  .06  .08  .06  .07  .08  .09  .07  .08  .08
Step (2) sec.   .08  .08  .04  .08  .09  .07  .07  .08  .09  .05  .08  .11
Step (3) sec.   .08  .06  .07  .08  .10  .08  .09  .08  .08  .06  .06  .07
LIN2(c) sec.    .08  .11  .06  .05  .09  .08  .11  .17  .21  .10  .20  .23
Total sec.      .31  .33  .25  .27  .36  .29  .34  .41  .47  .28  .42  .49

Table 2  Solution statistics for LIN2(c) and auxiliary problems.


The four-part procedure described above was tested on the Steinberg-Parks
problems for $W \in \{19, \ldots, 30\}$, for which $\rho^* < 0$. Table 2 displays the solution
times of the individual auxiliary problems and LIN2(c) for each relevant value
of $W$.

The auxiliary problems did make a significant difference in problem size and
solution time for LIN2(c). LIN2(c) contains from 13 to 316 variables as solved,
and total solution time never exceeds one half second. When we try to solve
LIN2(c) without the auxiliary problems (using more easily calculated bounds),
problem sizes range from 250 to 1824 variables and some run times exceed 30
seconds.

So,  the  IP  approach  yields  solutions  reasonably  quickly  and  the  programming 
effort  is  minimal  even  though  a  number  of  auxiliary  problems  may  need  to 
be  solved.  The  approach  does  not  really  depend  on  the  form  of  the  model’s 
constraints,  so  it  is  much  more  flexible  than  DP.  However,  both  the  IP  and 
DP  approaches  require  that  returns  be  independent  normal  random  variables. 
General  return  distributions  with  an  arbitrary  dependency  structure  are  allowed 
in  the  Monte  Carlo  method  we  develop  in  the  rest  of  the  paper. 


3  SKP  WITH  GENERAL  RETURN  DISTRIBUTIONS 

In this section, we consider (1), which for convenience we restate here as

\[
\begin{array}{rl}
z^* = \max & P(rx \ge c) \\
\text{s.t.} & x \in X,
\end{array}
\tag{1}
\]

where $r$ is a random vector with a general distribution. Thus, $r$ may be
non-normal and may have dependent components. In the context of the stochastic
knapsack problem with returns that are independent both between and within
item types, (1) specializes to (2) with $r_{kl}$, $k = 1, \ldots, K$, $l \in \mathcal{L}_k$, independent.
And, when returns are perfectly correlated within an item type but independent
between types, (1) specializes to (3) with $r_k$, $k = 1, \ldots, K$, independent. We
will consider these two special cases in our computational work, but we develop
the Monte Carlo solution procedure in the more general context of (1), without
independence assumptions on the components of $r$.

When stochastic optimization problems such as (1) do not have a special
structure such as normally distributed returns (see Section 2), it is usually
necessary to resort to approximation procedures and to accept an approximate
solution. One common approach is to replace the "true" distribution
of the random vector $r$ with an approximating distribution that is more
manageable from a computational perspective; see Wets [25, §6]. A Monte Carlo
procedure that generates independent and identically distributed (i.i.d.)
observations, $r^j$, $j = 1, \ldots, m$, from the distribution of $r$ may be viewed from
this perspective: These observations (which we will also refer to as scenarios)
are the realizations of an $m$-point empirical approximating distribution. As we
will show, modest values of $m$ can yield computationally tractable optimization
models that provide good approximations of SKP.

Let $I(\cdot)$ be the indicator function that takes on the value 1 if its argument
is true, and is 0 otherwise. With this notation,

\[
P(rx \ge c) = E\, I(rx \ge c) = E\left[ \frac{1}{m} \sum_{j=1}^{m} I(r^j x \ge c) \right].
\]

Thus, the approximating problem based on an empirical distribution is

\[
\begin{array}{rl}
U_m = \max\limits_{x} & \frac{1}{m} \sum_{j=1}^{m} I(r^j x \ge c) \\
\text{s.t.} & x \in X.
\end{array}
\tag{13}
\]

By observing that

\[
z^* = \max_{x \in X} P(rx \ge c)
= \max_{x \in X} E\left[ \frac{1}{m} \sum_{j=1}^{m} I(r^j x \ge c) \right]
\le E\left[ \max_{x \in X} \frac{1}{m} \sum_{j=1}^{m} I(r^j x \ge c) \right]
= E U_m,
\tag{14}
\]

we see that $U_m$ is an upper bound, in expectation, on the optimal solution
value $z^*$; see Mak et al. [14].
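The inequality in (14) rests on a sample-by-sample fact: on any given set of scenarios, the maximized frequency is at least the frequency achieved by every fixed feasible $x$. The Python sketch below illustrates this on a tiny knapsack. It assumes returns that are perfectly correlated within a type, as in (3), so each scenario is a $K$-vector; enumeration replaces an optimization solver, and all names and numbers are hypothetical.

```python
import random
from itertools import product

def feasible_set(w, W):
    """All integer x >= 0 with wx <= W (enumeration; tiny instances only)."""
    ranges = [range(W // wk + 1) for wk in w]
    return [x for x in product(*ranges)
            if sum(wk * xk for wk, xk in zip(w, x)) <= W]

def frequency(x, scenarios, c):
    """Fraction of scenarios r^j with r^j x >= c."""
    hits = sum(1 for r in scenarios
               if sum(rk * xk for rk, xk in zip(r, x)) >= c)
    return hits / len(scenarios)

def upper_bound_estimate(X, scenarios, c):
    """U_m: the best in-sample frequency over the feasible set."""
    return max(frequency(x, scenarios, c) for x in X)

# Hypothetical data: three item types with normal returns, m = 50 scenarios.
rng = random.Random(0)
w, W, c = [3, 4, 5], 10, 12
means, sds = [4.0, 6.0, 7.0], [2.0, 3.0, 5.0]
scenarios = [[rng.gauss(m, s) for m, s in zip(means, sds)] for _ in range(50)]
X = feasible_set(w, W)
u_m = upper_bound_estimate(X, scenarios, c)
```

Because the dominance holds for every sample, taking expectations of both sides at $x = x^*$ gives $E U_m \ge z^*$.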

Estimates of $E U_m$ are valuable in ascertaining the quality of a feasible
candidate solution $\hat{x} \in X$. We may estimate the objective value, $P(r\hat{x} \ge c)$, via

\[
L_m = \frac{1}{m} \sum_{j=1}^{m} I(r^j \hat{x} \ge c).
\]

Because $\hat{x}$ is, in general, suboptimal, $E L_m = P(r\hat{x} \ge c) \le z^*$. As we show
below, estimates of the upper bound $E U_m$ can be used to bound the optimality
gap, $z^* - P(r\hat{x} \ge c)$.

We generate a candidate solution $\hat{x} \in X$ by solving a single approximating
problem of the form (13). It is clearly desirable to ascertain the quality of such
a solution, and to do so we follow Mak et al. [14]. This procedure consists of
using the method of batch means to construct a one-sided confidence interval
on the optimality gap, $z^* - P(r\hat{x} \ge c)$, by forming i.i.d. observations of

\[
G_m = U_m - L_m
= \max_{x \in X} \left[ \frac{1}{m} \sum_{j=1}^{m} I(r^j x \ge c) \right]
- \frac{1}{m} \sum_{j=1}^{m} I(r^j \hat{x} \ge c).
\]

Since $E U_m \ge z^*$ and $E L_m = P(r\hat{x} \ge c)$, it follows that $E G_m \ge z^* -
P(r\hat{x} \ge c)$. Hence, we may use multiple observations of $G_m$ to construct point
and interval estimates for the optimality gap.

The upper and lower bound estimators that define $G_m$ use the same stream
of random numbers $r^j$, $j = 1, \ldots, m$; this use of common random numbers is a
well-known variance reduction technique. (See, for example, Law and Kelton
[13, §11.2] for a general discussion of common random numbers; for
computational results in stochastic programming, see Mak et al. [14].) In our current
setting, common random numbers have the additional benefit of ensuring
non-negative estimates of the optimality gap, since, by construction, $G_m \ge 0$; this
could not be guaranteed if $U_m$ and $L_m$ were estimated separately with distinct
random number streams. Before summarizing our Monte Carlo procedure for
approximately solving SKP, we turn to the issue of evaluating $L_m$ and $U_m$.

Evaluating $L_m$ is straightforward: Given $\hat{x}$, we generate $r^j$, $j = 1, \ldots, m$,
and for each observation simply test whether or not $r^j \hat{x} \ge c$ and compute
$L_m = \frac{1}{m} \sum_{j=1}^{m} I(r^j \hat{x} \ge c)$.

To  calculate  Um,  we  convert  (13)  into  the  following  equivalent  IP 


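\[
\begin{array}{lll}
\max_{x,y} & \frac{1}{m} \sum_{j=1}^{m} y_j & \\
\text{s.t.} & x \in X & \\
            & r^j x \ge c\,y_j - M_j (1 - y_j) & \forall j = 1, \ldots, m \\
            & y \in \{0,1\}^m. &
\end{array}
\tag{15}
\]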


Here, $M_j > 0$ is large enough to ensure that the $j$th constraint,
$r^j x \ge c\,y_j - M_j (1 - y_j)$, is vacuous when $y_j = 0$.
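One way to pick a valid $M_j$ for the knapsack set $X = \{x \in Z_+^K : wx \le W\}$ is sketched below. This particular bound is an assumption made for illustration, not a choice stated in the text: it relies on positive weights and on the inequality $r^j x = \sum_k (r^j_k / w_k)(w_k x_k) \ge W \min(0, \min_k r^j_k / w_k)$. The function names and the floor of 1.0 (used only to keep $M_j$ strictly positive) are hypothetical.

```python
import itertools


def feasible_set(weights, capacity):
    """All non-negative integer x with sum_k w_k * x_k <= capacity."""
    ranges = [range(capacity // w + 1) for w in weights]
    return [x for x in itertools.product(*ranges)
            if sum(w * xk for w, xk in zip(weights, x)) <= capacity]


def big_m(r, weights, capacity):
    """A valid M_j > 0 such that r.x >= -M_j for every x in the knapsack set."""
    worst_ratio = min(rk / wk for rk, wk in zip(r, weights))
    return max(1.0, -capacity * min(0.0, worst_ratio))


def scenario_constraint_holds(r, x, c, y, m_j):
    """Check r.x >= c*y - M_j*(1 - y) for a binary indicator y."""
    rx = sum(rk * xk for rk, xk in zip(r, x))
    return rx >= c * y - m_j * (1 - y)


# Hypothetical scenario with one negative return.
r, weights, capacity, c = [-2.0, 3.0], [1, 2], 5, 4.0
m_j = big_m(r, weights, capacity)
# With y = 0 the constraint never cuts off a feasible x.
print(all(scenario_constraint_holds(r, x, c, 0, m_j)
          for x in feasible_set(weights, capacity)))
```

With $y_j = 1$ the same check reduces to $r^j x \ge c$, so $y_j$ can be set to 1 only in scenarios where the threshold is met.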

The Monte Carlo procedure for solving SKP begins by solving an empirical
approximating problem (15) with $m'$ scenarios to generate a candidate solution
$\hat{x}$. Then, we use the method of common random numbers, with a batch size
of $m$, to construct an approximate $(1 - \alpha)$-level confidence interval on the
optimality gap, $z^* - P(r\hat{x} \ge c)$. In practice, we typically choose $m'$
larger than $m$ in an attempt to find a good candidate solution.

Procedure MCSKP

Input: Data for SKP with $K$ items: $w$, $c$, $W$, and distribution for $r$. Batch
size $m$, sample size (number of batches) $n$, and size of approximating problem
to generate candidate solution, $m'$. Confidence interval level $1 - \alpha$ and
$z_\alpha$ satisfying $P(N(0,1) \le z_\alpha) = 1 - \alpha$.

Output: Solution $\hat{x}$, approximate $(1 - \alpha)$-level confidence interval
$[0, \bar{G}(n) + \epsilon_G]$ on the optimality gap.

{

    /* Generate Candidate Solution */

    Generate $r^1, \ldots, r^{m'}$ i.i.d. from the distribution of $r$;

    $\hat{x} \leftarrow \arg\max_{x \in X} \frac{1}{m'} \sum_{j=1}^{m'} I(r^j x \ge c)$;

    /* Optimality Gap Calculations */

    For ($i = 1$ to $n$) {

        Generate $r^{i1}, \ldots, r^{im}$ i.i.d. from the distribution of $r$;

        $G_m^i \leftarrow \max_{x \in X} \frac{1}{m} \sum_{j=1}^{m} I(r^{ij} x \ge c) - \frac{1}{m} \sum_{j=1}^{m} I(r^{ij} \hat{x} \ge c)$;

    }

    $\bar{G}(n) \leftarrow \frac{1}{n} \sum_{i=1}^{n} G_m^i$;

    $S_G^2(n) \leftarrow \frac{1}{n-1} \sum_{i=1}^{n} [G_m^i - \bar{G}(n)]^2$;

    $\epsilon_G \leftarrow z_\alpha S_G(n) / \sqrt{n}$;

    Print{“Approximate solution to SKP:”, $\hat{x}$};

    Print{“Confidence interval on the optimality gap:”, $[0, \bar{G}(n) + \epsilon_G]$};

}
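The final steps of the procedure (sample mean, sample variance, and half-width) can be sketched as follows, assuming the $n$ gap observations $G_m^i$ have already been computed. The function name is hypothetical, and the $1/(n-1)$ divisor is the usual sample-variance convention.

```python
import math
import statistics


def gap_confidence_interval(gaps, alpha=0.05):
    """Return (G_bar, eps_G, interval) for a one-sided (1 - alpha)-level CI."""
    n = len(gaps)
    g_bar = statistics.fmean(gaps)
    s_g = statistics.stdev(gaps)  # square root of the (n - 1)-divisor variance
    z_alpha = statistics.NormalDist().inv_cdf(1.0 - alpha)
    eps_g = z_alpha * s_g / math.sqrt(n)
    return g_bar, eps_g, (0.0, g_bar + eps_g)


# Three made-up gap observations, purely for illustration.
print(gap_confidence_interval([0.0, 0.02, 0.04]))
```

If every batch returns the same gap, the sample standard deviation is zero and the interval collapses to $[0, \bar{G}(n)]$.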

The MCSKP procedure was implemented in GAMS [1] and the IPs solved
using CPLEX Version 3.0 [3]. All computational tests in this section were
performed on an IBM RS-6000 Model 590 computer with 512 megabytes of RAM.
Because we already know optimal solutions to the Steinberg-Parks problems,
and can perform exact evaluations of $P(r\hat{x} \ge c)$ for candidate solutions
$\hat{x}$, we can make some interesting observations regarding the performance of
the Monte Carlo solution procedure from Table 3. In two of the five cases, the
$\hat{x}$ found by solving the empirical problem with $m' = 200$ scenarios is
optimal. By definition, the approximate 95% confidence interval achieves the
desired coverage provided that $z^* - P(r\hat{x} \ge c)$ falls within the interval.
For example, when $W = 20$, $z^* - P(r\hat{x} \ge c) = 0.039$ falls in $[0, 0.062]$.
Table 3 indicates that the desired coverage is achieved in each of the five cases.
In fact, in each case the optimality gap is smaller than the point estimate
$\bar{G}(n)$; this is not surprising since $E\bar{G}(n) \ge z^* - P(r\hat{x} \ge c)$.
Because the point estimate of the gap is biased in this manner, we tend to obtain
conservative confidence interval statements (a caveat to this, due in part to the
discrete nature of the integer SKPs, is discussed below). We note that when
$W = 30$, the confidence interval provides an effectively vacuous statement since
the probability of achieving the target is within 0.03 of 1. (The MCSKP procedure
must be applied with some care, if at all, when $P(r\hat{x} \ge c)$ is close to 0
or 1.)

$W$    $\bar{G}(n)$   $\epsilon_G$   95% CI        $P(r\hat{x} \ge c)$   $z^*$    CPU (min.)
10     0.006          0.003          [0, 0.009]    0.014                 0.014    19.4
15     0.072          0.010          [0, 0.082]    0.124                 0.173    26.9
20     0.052          0.010          [0, 0.062]    0.549                 0.588    32.7
25     0.020          0.007          [0, 0.027]    0.915                 0.915    25.6
30     0.025          0.005          [0, 0.030]    0.978                 0.995    19.7

Table 3  Results of the Monte Carlo solution procedure for the Steinberg-Parks SKPs.
Returns are normal random variables that are independent between and within item
types. In these computations $m' = 200$ (candidate generation), $m = 100$ (batch
size), and $n = 30$ (number of batches).
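The coverage statements drawn from Table 3 amount to a few subtractions and comparisons, which the following sketch reproduces. The numbers are transcribed from Table 3; the variable and function names are hypothetical.

```python
# Rows transcribed from Table 3: (W, G_bar, eps_G, P(r x_hat >= c), z_star).
TABLE_3 = [
    (10, 0.006, 0.003, 0.014, 0.014),
    (15, 0.072, 0.010, 0.124, 0.173),
    (20, 0.052, 0.010, 0.549, 0.588),
    (25, 0.020, 0.007, 0.915, 0.915),
    (30, 0.025, 0.005, 0.978, 0.995),
]


def coverage_report(rows):
    """For each row: (W, true gap, CI upper end, covered?, gap < G_bar?)."""
    report = []
    for w, g_bar, eps, p_hat, z_star in rows:
        gap = round(z_star - p_hat, 3)   # exact optimality gap
        upper = round(g_bar + eps, 3)    # upper end of the 95% interval
        report.append((w, gap, upper, gap <= upper, gap < g_bar))
    return report


for row in coverage_report(TABLE_3):
    print(row)
```

The rows with a gap of zero are the two cases in which the candidate solution is optimal.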

The primary goal of the MCSKP procedure is to obtain a solution $\hat{x}$ of
“high quality” and to make a probabilistic statement concerning this quality.
The procedure does not include a point estimate of $P(r\hat{x} \ge c)$ because we
regard this as being of secondary importance relative to obtaining an $\hat{x}$ of
high quality. Of course, a point estimate is straightforward to compute, if desired.

In order to study the effect of the number of scenarios $m'$ on the quality of the
candidate solution, $\hat{x}$, we took the problem with the poorest solution (widest
optimality gap) from Table 3 ($W = 15$) and ran the Monte Carlo procedure
for various values of $m'$. The results are summarized in Table 4. To reduce
the variability due to sampling, the candidate-generation and
optimality-gap-estimation phases of the MCSKP procedure were, respectively,
initialized with the same seeds for generating pseudo-random variates for each
value of $m'$. This has two effects: First, when increasing $m'$ from, say, 300 to
400 we have simply added 100 additional scenarios to the original 300. Second,
when the candidate-generation phase finds the same $\hat{x}$ for different values
of $m'$ (i.e., $m' = 50, 100$ and $m' = 400, 500, 600$) the gap-estimation results
are identical. Note that $m' = 400$, 500, and 600 all yield an optimal solution.

$m'$   $\bar{G}(n)$   $\epsilon_G$   95% CI        $P(r\hat{x} \ge c)$   CPU (min.)
50     0.091          0.010          [0, 0.101]    0.102                 25.8
100    0.091          0.010          [0, 0.101]    0.102                 26.0
200    0.072          0.010          [0, 0.082]    0.124                 26.9
300    0.040          0.010          [0, 0.051]    0.159                 28.2
400    0.026          0.009          [0, 0.035]    0.173                 32.1
500    0.026          0.009          [0, 0.035]    0.173                 36.8
600    0.026          0.009          [0, 0.035]    0.173                 43.4

Table 4  We illustrate the quality of the candidate solution generated by solving
empirical approximating problems for the SKP with $W = 15$ for various batch sizes
$m'$. This problem has $z^* = 0.173$. For constructing the confidence intervals we
use $m = 100$ and $n = 30$. The CPU times are for the entire MCSKP procedure.
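The first effect of reusing seeds, namely that a larger scenario set contains the smaller one as a prefix, can be sketched with a seeded generator. This is only an illustration in Python's standard `random` module, not the GAMS code used for the experiments; the function name and the parameter values are hypothetical.

```python
import random


def scenarios(seed, m, means, sds):
    """Draw m scenarios of independent normal returns from a fixed seed."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sd) for mu, sd in zip(means, sds)]
            for _ in range(m)]


means, sds = [3.0, 5.0], [1.0, 2.0]
small = scenarios(7, 300, means, sds)
large = scenarios(7, 400, means, sds)
# The 400-scenario set is the 300-scenario set plus 100 new scenarios.
print(large[:300] == small)
```

Since the gap-estimation phase is seeded separately, two runs that arrive at the same candidate solution then see identical batches and report identical gap estimates.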

As Tables 3 and 4 indicate, even when the candidate-generation phase finds
an optimal solution, we still obtain confidence intervals with widths ranging
from 0.009 to 0.035. There are two reasons for this: First, there is a contribution
due to $\bar{G}(n)$ that originates from the inequality in (14), obtained by
exchanging the optimization and expectation operators. Second, there is a
contribution due to sampling error, which is captured in $\epsilon_G$. Table 5 shows
a decrease in both these terms as the batch size $m$ grows. In fact, it is possible
to show that $EU_m$ decreases monotonically in $m$ [17, 14]. The increase in CPU
times with larger batch sizes in Table 5 (and to a lesser extent in Table 4) is due,
in part, to the IP (15) becoming larger. In addition, the IP optimality gap must be
shrunk to a value less than $1/m$ to ensure optimality, and this also increases
solution times.
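The $1/m$ tolerance follows from the granularity of the objective in (15), assuming that objective is scaled by $1/m$ so that it equals $U_m$. In the short derivation below, $\underline{v}$ (value of the incumbent), $\bar{v}$ (the solver's upper bound), and $v^*$ (the optimal value of (15)) are symbols introduced here for illustration only.

```latex
\[
\frac{1}{m} \sum_{j=1}^{m} y_j \in \Bigl\{ 0, \tfrac{1}{m}, \tfrac{2}{m}, \ldots, 1 \Bigr\}
\quad \text{for every } y \in \{0,1\}^m ,
\]
\[
\underline{v} \le v^* \le \bar{v} < \underline{v} + \tfrac{1}{m}
\;\Longrightarrow\; v^* = \underline{v} ,
\]
```

since $v^*$ and $\underline{v}$ are both integer multiples of $1/m$. The required tolerance therefore tightens as $m$ grows, which is consistent with the longer solution times at larger batch sizes.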

As  indicated  in  Section  1,  certain  systems  lead  to  SKPs  in  which  the  returns 
within  (as  well  as  between)  item  types  are  not  independent.  Table  6  summarizes 
computational  results  for  a  variant  of  the  Steinberg-Parks  problems  in  which  the 
returns  are  normally  distributed  and  independent  between  item  types  but  are 
perfectly  correlated  within  each  type.  Because  the  number  of  integer  variables 
in  (15)  is  significantly  smaller  than  for  the  independent  case,  the  computational 
effort  is  significantly  less  for  this  model. 
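The difference between the two dependence structures shows up in the variance of the total return. The sketch below evaluates $P(rx \ge c)$ exactly for normal returns under each assumption: with independence within item types the variance of $x_k$ copies of type $k$ is $x_k \sigma_k^2$, whereas with perfect correlation it is $x_k^2 \sigma_k^2$. The function name and the example data are hypothetical.

```python
import math
from statistics import NormalDist


def prob_meets_threshold(x, means, variances, c, correlated_within_type=False):
    """Exact P(r x >= c) for normal returns, independent between item types."""
    mean = sum(mu * xk for mu, xk in zip(means, x))
    if correlated_within_type:
        # x_k copies of type k share one draw: Var = x_k^2 * sigma_k^2.
        var = sum(v * xk ** 2 for v, xk in zip(variances, x))
    else:
        # x_k independent draws of type k: Var = x_k * sigma_k^2.
        var = sum(v * xk for v, xk in zip(variances, x))
    if var == 0:
        return 1.0 if mean >= c else 0.0
    return 1.0 - NormalDist(mean, math.sqrt(var)).cdf(c)


x, means, variances = (2, 1), (3.0, 4.0), (1.0, 2.0)
print(prob_meets_threshold(x, means, variances, 12.0))
print(prob_meets_threshold(x, means, variances, 12.0, correlated_within_type=True))
```

In this example the threshold exceeds the mean return of 10, so the larger variance under perfect correlation gives the larger probability of meeting it.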
$m$    $\bar{G}(n)$   $\epsilon_G$   95% CI        CPU (min.)
25     0.075          0.020          [0, 0.095]    24.6
50     0.056          0.012          [0, 0.068]    26.1
100    0.026          0.009          [0, 0.035]    32.1
200    0.019          0.006          [0, 0.025]    41.7
300    0.010          0.004          [0, 0.014]    115.2
400    0.008          0.003          [0, 0.011]    208.4

Table 5  We illustrate the effect of the batch size $m$ on the tightness of the
confidence interval by applying the MCSKP procedure to the SKP with $W = 15$ for an
optimal candidate solution. We use a sample size of $n = 30$. The CPU times include
the time required to solve the $m' = 400$ scenario problem to find an optimal
candidate solution.


$W$    $\bar{G}(n)$   $\epsilon_G$   95% CI        $P(r\hat{x} \ge c)$   CPU (min.)
10     0.000          0.000          [0, 0.000]    0.090                 3.2
15     0.000          0.001          [0, 0.001]    0.327                 6.2
20     0.021          0.007          [0, 0.028]    0.561                 9.2
25     0.017          0.006          [0, 0.023]    0.872                 5.7
30     0.016          0.005          [0, 0.021]    0.973                 2.8

Table 6  Results of the Monte Carlo solution procedure for SKPs with normal returns
that are independent between, but perfectly correlated within item types. In these
computations $m' = 200$ (candidate generation), $m = 100$ (batch size), and $n = 30$
(number of batches).


Note that in Table 6 the confidence interval width is actually 0 for $W = 10$
and is 0.001 for $W = 15$. While this may be somewhat disconcerting,
when $W = 10$ each of the $n = 30$ empirical problems ($m = 100$) yielded the
same solution $\hat{x}$ as the candidate-generation problem ($m' = 200$). And, when
$W = 15$, 29 of the 30 empirical problems generated the same solution $\hat{x}$ as
the candidate-generation problem (to four digits, $\bar{G}(n) = 0.0003$ for this case).
Such results are partly due to the discrete nature of the integer SKP and would
be less likely to occur if the decision variables were continuous, particularly if
the solutions were not extreme points of $X$.

Finally, Table 7 summarizes the computational results for another variant of
the Steinberg-Parks problems in which distributions of the returns are assumed
to be uniform, having the same mean and variance as the normal distributions of
the original Steinberg-Parks data. Here, the returns are independent between
and within item types. In this case, both the required computational effort
and the magnitude of the confidence interval widths are very similar to those
for normally distributed returns (see Table 3).

$W$    $\bar{G}(n)$   $\epsilon_G$   95% CI        CPU (min.)
10     0.003          0.002          [0, 0.005]    19.3
15     0.030          0.011          [0, 0.041]    26.1
20     0.057          0.015          [0, 0.072]    32.4
25     0.020          0.007          [0, 0.027]    26.3
30     0.014          0.004          [0, 0.018]    19.4

Table 7  Results of the Monte Carlo solution procedure for SKPs with uniformly
distributed returns that are independent between and within item types. In these
computations $m' = 200$ (candidate generation), $m = 100$ (batch size), and
$n = 30$ (number of batches).
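Matching a uniform distribution to a given mean $\mu$ and variance $\sigma^2$ is a short calculation: a uniform on $[a, b]$ has mean $(a + b)/2$ and variance $(b - a)^2/12$, so $a = \mu - \sqrt{3}\,\sigma$ and $b = \mu + \sqrt{3}\,\sigma$. The sketch below is one way this could be done; the function name and example values are hypothetical, and the text does not say how the endpoints were actually computed.

```python
import math
import random


def matching_uniform(mu, var):
    """Endpoints (a, b) of the uniform with mean mu and variance var."""
    half_width = math.sqrt(3.0 * var)
    return mu - half_width, mu + half_width


a, b = matching_uniform(5.0, 4.0)
rng = random.Random(0)
draw = rng.uniform(a, b)  # one uniformly distributed return
print(a, b, draw)
```

A return generated this way is bounded, unlike its normal counterpart, but it has the same first two moments.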

4  CONCLUSIONS 

This paper has considered stochastic integer programming problems, with
deterministic constraints, where the objective is to maximize the probability of
meeting or exceeding a certain return threshold. We have developed three
solution procedures. In Section 2.1, we presented a new dynamic-programming
method for the special case of the stochastic knapsack problem with normally
distributed returns that are independent between and within item types. This
method is conceptually simple, easy to program, easy to modify for bounded
variables, and significantly faster than previously available procedures. In
Section 2.2, we described integer programming techniques with the same structure
on the random returns but with more general constraint sets. We used two
different linearized integer programs coupled with several auxiliary integer
programs. These methods were tested and shown to be effective. Finally, the
Monte Carlo solution procedure of Section 3 addressed problems under very
general assumptions regarding the distribution of the vector of random
returns. Due to the more general problem structure, we solved an approximating
problem whose solution quality was specified only in a probabilistic sense.
Nevertheless, our computational results demonstrated that good solutions can be
obtained with modest sample sizes.